Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix STAC writing and add AWS writing functions #53

Merged
merged 49 commits into from
Sep 25, 2024
Merged

Conversation

alexgleith
Copy link
Contributor

@alexgleith alexgleith commented Feb 21, 2024

Refactor STAC writing, as I don't think it was working before. Was only writing a single asset.

Fixes path returns from the writer.

Also implements an S3 writer, including all the existence checks and so on.

I think this is ready for merging, although I could clean up the commits a bit!

@alexgleith alexgleith changed the base branch from tweaks-to-new-version to develop February 21, 2024 09:21
@alexgleith
Copy link
Contributor Author

Ok, rebased and cleaned up the commits.

@alexgleith alexgleith changed the title Add first draft AWS writing functions Fix STAC writing and add AWS writing functions Feb 21, 2024
Base automatically changed from develop to main February 22, 2024 21:16
dep_tools/searchers.py Outdated Show resolved Hide resolved
dep_tools/utils.py Outdated Show resolved Hide resolved
stac_writer = StacWriter(itempath, write_stac_function) if write_stac else None
super().__init__(itempath, pre_processor, cog_writer, stac_writer)
stac_writer = (
StacWriter(itempath, write_stac_function, **kwargs) if write_stac else None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and below) is not going to work. We can't send the same kwargs to three places. I wrote DepWriter to deal with this situation. I was on the fence about writing this wrapper, maybe it wasn't the best idea

dep_tools/writers.py Outdated Show resolved Hide resolved
dep_tools/searchers.py Outdated Show resolved Hide resolved
@@ -102,8 +102,8 @@ def bbox_across_180(region: GeoDataFrame) -> BBOX | tuple[BBOX, BBOX]:
)
if bbox_crosses_antimeridian:
# Split into two bboxes across the antimeridian
xmax_ll, ymin_ll = bbox[0], bbox[1]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the bbox is e.g. [-179, -1, 179, 1] then -179 is east of 179 and therefore the xmax.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not saying it's perfect, can you say where this was causing issues?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bounding boxes are (in the tools we use) [minx, miny, maxx, maxy]

So we should remain consistent with that.

I think one of your tests had a geometry where effectively the bbox was world wrapping, so doing xmax_ll, ymin_ll = bbox[0], bbox[1] worked, but it failed on "normal" bboxes.

I tested all this a fair bit when I wrote it. I think including the antimeridian library makes it more robust too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in that format ([minx, miny, maxx, maxy]), in "real world" coordinates. It comes down to whether the smaller number is the minimum, or the western most point is the minimum. This selects the latter. We could write it either way.

Please show me an example of where this code fails. Your fix is not passing existing tests which the existing version worked for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the bbox is e.g. [-179, -1, 179, 1] then -179 is east of 179 and therefore the xmax.

Ok, I see what you're saying, but I think this is the core of the issue, i.e., it's ambiguous. It could be a valid BBOX, which spans nearly the entire world, and it's really more valid to think of it as that, since the convention is to say minx, miny, maxx, maxy.

Does that make sense? It's either a bounding box going from far west to far east, or a bounding box going just across the antimeridian... and we should assume the former, I think.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hence my comment "these are pacific specific tests"...i.e. I was inferring the latter

@@ -24,13 +23,14 @@ def test_shift_negative_longitudes_noncrossing_linestring():

def test_bbox_across_180_crossing():
crossing_polygon = Polygon(
[(179.0, 1.0), (-179.0, 1.0), (-179.0, -1.0), (179.0, -1.0), (179.0, 1.0)]
[(179.0, 1.0), (179.0, -1.0), (-179.0, -1.0), (-179.0, 1.0), (179.0, 1.0)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reason for this change? I was trying to send something that was incorrectly wound (clockwise) I think

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The https://pypi.org/project/antimeridian/ package throws warnings for incorrectly wound polygons.

I don't think we should be handling bad geometry, we should just make sure we don't generate it.

Is the incorrect geometry something we need to handle?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should, because it's out there. Why not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, sure.

We do need to handle bad data, sure.

Let's do everything we possibly can to ensure we don't produce our own bad data, and then we don't need to write our own stuff that handles it! haha (Not saying we are creating bad data now, of course!)

@alexgleith
Copy link
Contributor Author

Hey @jessjaco the only failing test now is this one:

FAILED tests/test_landsat_mspc_stac_search_fixes.py::test_search_for_stac_items_with_bad_geoms - assert 12 == 24

And I haven't fixed it yet because I don't know what the correct answer is. Presumably the result should be 24, but as it does an actual search of the MSPC API, it's hard to know, as they might be returning less items.

@jessjaco
Copy link
Collaborator

jessjaco commented Mar 28, 2024

Hey @jessjaco the only failing test now is this one:

FAILED tests/test_landsat_mspc_stac_search_fixes.py::test_search_for_stac_items_with_bad_geoms - assert 12 == 24

And I haven't fixed it yet because I don't know what the correct answer is. Presumably the result should be 24, but as it does an actual search of the MSPC API, it's hard to know, as they might be returning less items.

It's from your own issue, as I commented in the code: microsoft/PlanetaryComputer#296

edit: Ok, I admit that 24 is a magic number not inferred from the issue, but this is the situation I was trying to repair. I welcome a more robust test.

@alexgleith
Copy link
Contributor Author

It's from your own issue, as I commented in the code

Sorry, not trying to blame here! Just saying I didn't fix it by making it 12 because I wanted to investigate :)

@jessjaco
Copy link
Collaborator

It's from your own issue, as I commented in the code

Sorry, not trying to blame here! Just saying I didn't fix it by making it 12 because I wanted to investigate :)

Now that I look at it, why did you make it 24?

@alexgleith
Copy link
Contributor Author

Now that I look at it, why did you make it 24?

Oh dear, it does appear I'm to blame!

@jessjaco jessjaco merged commit f9f5e2d into main Sep 25, 2024
2 checks passed
@jessjaco jessjaco deleted the add-s3-writer branch September 25, 2024 21:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants